
BUG: Fix segfault in lib.isnullobj #13764


Closed
wants to merge 1 commit

Conversation

gfyoung (Member) commented Jul 23, 2016

A weird segfault arises when you call lib.isnullobj with an array that uses 0-field values to mean None.
Changing the input to a plain Python object (i.e. no typing) makes the segfault go away. Discovered when printing a DataFrame containing such an array produced segfaults.

Closes #13717.

codecov-io commented Jul 23, 2016

Current coverage is 84.56% (diff: 100%)

No coverage report found for master at b60e42b.

Powered by Codecov. Last update b60e42b...cd07769

@@ -342,7 +342,7 @@ def item_from_zerodim(object val):

 @cython.wraparound(False)
 @cython.boundscheck(False)
-def isnullobj(ndarray[object] arr):
+def isnullobj(arr):
Contributor: there is another isnull routine right below this

gfyoung (Member Author), Jul 23, 2016: Sigh... will fix that too.

jreback (Contributor) commented Jul 23, 2016

You need to check that it's iterable now, e.g. a scalar will fail.

jreback (Contributor) commented Jul 23, 2016

Or at least assert that (as this cannot be called with a scalar).
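A minimal sketch (plain Python rather than the merged Cython) of the kind of assertion being suggested; the guard itself is hypothetical, not the code that was committed:

import numpy as np

def isnullobj(arr):
    # reject scalars up front, since the function only makes sense for arrays
    assert isinstance(arr, np.ndarray), "isnullobj expects an ndarray, not a scalar"
    ...  # the existing element-wise null check would continue here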

        tm.assert_numpy_array_equal(result, expected)

    def test_non_ndarray_inp(self):
        ind = pd.Index([1, None, 'foo', -5.1, pd.NaT, np.nan])
Contributor: put all of these tests with the null tests - you will have to find them

Member Author: Not a very helpful comment, tbh. This is a lib method, so I don't see why we wouldn't put the tests here rather than go "hunting" for unit tests like this.

Contributor: Because you will now have tests in multiple places for the same thing. Some lib tests are segregated; try types/missing.

gfyoung (Member Author), Jul 23, 2016: But they're unit tests for two different methods, one of which is a helper for the other. Sure, they might be similar, but from an organisation perspective, I think it's easier to put the tests next to the module whose methods or functions you are testing. That makes it easier to find and write tests, IMO.

Contributor: The bottom line is to be logical about where tests go so they are easy to find.

My point is that tests for this should be in one place. You can pick where, but consolidate the like tests.

Or are these functions not tested anywhere else?

Member Author: The specific lib functions aren't being tested anywhere, so I intended to put unit tests for them in test_lib.py. The null functions you were referring to are those exposed at the Python layer, not the internal Cython ones (which are what I am fixing).

gfyoung (Member Author) commented Jul 23, 2016

@jreback : good point about the scalar thing - I think I actually don't need to remove ndarray entirely. If I remove the object dtype specification, that might be sufficient. Testing it right now.
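In Cython terms, the two signatures being weighed look roughly like this (a sketch of the idea only, not the exact diff that was merged):

# typed object buffer: fast element access, but segfaults on the 0-field
# arrays described in pandas-devgh-13717
def isnullobj(ndarray[object] arr): ...

# keep the ndarray type (so scalars are still rejected) but drop the [object]
# buffer specification, so element access falls back to Python-level indexing
def isnullobj(ndarray arr): ...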

gfyoung force-pushed the isnullobj-segfault branch from eb1e8ee to cd07769 (July 23, 2016 11:46)
gfyoung (Member Author) commented Jul 23, 2016

@jreback : "fixed" all four isnullobj methods and added tests for all of them, and Travis is still passing. Ready to merge if there are no other concerns.

jreback added the Bug and Compat (pandas objects compatibility with Numpy or Python functions) labels Jul 23, 2016
jreback added this to the 0.19.0 milestone Jul 23, 2016
jreback (Contributor) commented Jul 23, 2016

lgtm. Can you give a perf test and see if anything is material (I don't think so, but it can't hurt), as this is the null tester for strings.

gfyoung (Member Author) commented Jul 23, 2016

What do you mean by "null tester for strings"? What method are you referring to?

jreback (Contributor) commented Jul 23, 2016

This is called for null testing on strings.

gfyoung (Member Author) commented Jul 23, 2016

From where exactly? Are you talking about the pandas.types.missing methods? (Also, why focus on strings specifically?)

jreback (Contributor) commented Jul 23, 2016

Look at how isnull works. That is one of only two places this is called.
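For context, a simplified sketch of the call path being referred to; the helper name below is illustrative, and the real code (in pandas.types.missing at the time) handles more dtypes and shapes:

import numpy as np
from pandas import lib

def _isnull_object_array(values):
    # object-dtype arrays, which is how string data is stored, are routed
    # through the Cython helper touched by this PR
    values = np.asarray(values, dtype=object)
    return lib.isnullobj(values.ravel()).reshape(values.shape)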

gfyoung (Member Author) commented Jul 23, 2016

Okay, that's what I thought. Where should this perf test go in asv_bench? In strings.py I presume?

jreback (Contributor) commented Jul 23, 2016

Oh, I mean there probably is a benchmark that tests null on strings already; if not, see where the null benchmarks are (I think they are all in one place).

gfyoung force-pushed the isnullobj-segfault branch from cd07769 to 1f90104 (July 23, 2016 21:30)
gfyoung (Member Author) commented Jul 23, 2016

@jreback : I found isnull benchmarks in the DataFrame benchmarks, so I added one for strings there.
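For reference, an asv benchmark of that shape looks roughly like the sketch below; the class and method names are assumptions, not the exact code added to the frame benchmarks:

import string

import numpy as np
from pandas import DataFrame, isnull

class frame_isnull_strings(object):
    # asv-style benchmark: build a 1000x1000 frame of strings and time isnull on it

    def setup(self):
        sample = np.array(list(string.ascii_letters) + list(string.whitespace))
        self.df = DataFrame(np.random.choice(sample, (1000, 1000)))

    def time_isnull_strings(self):
        isnull(self.df)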

gfyoung force-pushed the isnullobj-segfault branch from 1f90104 to b10bbef (July 23, 2016 22:02)
gfyoung (Member Author) commented Jul 23, 2016

Travis is doing weird things. Putting [ci skip] in the commit until it corrects itself.

The updated commit message:

Weird segfault arises when you call lib.isnullobj
(or any of its equivalents like lib.isnullobj2d) with
an array that uses 0-field values to mean None. Changed
input to be a Python object (i.e. no typing), and the
segfault went away.

Closes pandas-devgh-13717.

[ci skip]
gfyoung force-pushed the isnullobj-segfault branch from b10bbef to 0338b5d (July 23, 2016 23:15)
jreback closed this in ee6c0cd Jul 24, 2016
        self.sample = np.array(list(string.ascii_lowercase) +
                               list(string.ascii_uppercase) +
                               list(string.whitespace))
        self.data = np.random.choice(self.choice, (1000, 1000))
Contributor: Please test; this was not building on asv (self.choice should be self.sample).

Member Author: Whoops, sorry! I was in a bit of a rush when I wrote it.

jreback (Contributor) commented Jul 24, 2016

Thanks.

There is a minor change I made in the asv benchmark. Also, it IS slightly slower, so I guess the emitted C code is slightly less optimal without the object typing (but we can't have segfaults, of course :)
